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Abstract 

In  this  paper  we  discuss  questions  related  to  reliability  or  variability  of 
estimated  parameters  in  deterministic  least  squares  problems.  By  viewing  the 
parameters  for  the  inverse  problem  as  realizations  for  a  random  variable  we  are 
able  to  use  standard  results  from  probability  theory  to  formulate  a  tractable 
probabilistic  framework  to  treat  this  uncertainty.  We  discuss  method  stability 
and  approximate  problems  and  are  able  to  show  convergence  of  solutions  of 
the  approximate  problems  to  those  of  the  original  problem.  The  efficacy  of 
our  approach  is  demonstrated  in  numerical  examples  involving  estimation  of 
constant  parameters  in  differential  equations. 
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1  Introduction 


A  standard  deterministic  inverse  problem  frequently  encountered  in  both  applied 
and  theoretical  literature  can  be  abstractly  stated  as  follows:  Given  a  parameter 
dependent  dynamical  or  algebraic  system 

M<h  «(?))  =  3{<i)  (!) 

with  states  u(q),  parameters  q  and  operators  (differential  or  algebraic)  A,  use  obser¬ 
vations  (possibly  incomplete)  or  data  on  the  states  to  determine  the  best  parameters 
q*  in  some  admissible  set  Q  so  that  the  solution  of  equation  (1)  for  q  =  q*  best  de¬ 
scribes  the  data.  For  such  deterministic  problems  there  is  a  large  literature  based  on 
diverse  formulations  (least  squares,  equation  error,  etc.).  For  discussions  of  some 
of  these  see  [3].  Once  one  has  “solved”  this  (by  no  means  trivial)  deterministic 
problem,  it  is  frequently  important  to  know  something  about  the  reliability  of  the 
estimates.  One  approach  entails  attaching  “error  bars”  to  the  estimated  parameter 
values,  much  like  one  does  in  standard  statistical  analysis  or  in  scientific  computa¬ 
tional  analysis  (using  a  priori  bounds)  with  finite  discretization  techniques  (finite 
difference,  finite  elements,  etc)  from  numerical  analysis.  In  essence  we  are  asking 
for  measurements  of  uncertainty  (inherent  in  our  methods  rather  than  in  our  data 
collection)  related  to  our  best  estimates  of  parameters  q.  Thus  we  are  led  in  a 
completely  natural  way  to  stochastic  or  probabilistic  aspects  of  estimates  from  a 
deterministic  problem  solved  with  deterministic  algorithms. 

We  offer  here  ideas  for  one  approach  to  treatment  of  variability  in  parameter 
estimation  techniques.  The  approach  is  based  on  viewing  multiple  observations 
{hflyLi  of  the  state  in  (1)  as  observations  corresponding  to  a  set  of  realizations 
{SjljLi  °f  the  parameter  q  which  is  now  thought  of  as  a  specific  (albeit  unknown) 
random  variable  with  probability  distribution  P  on  Q.  The  system  (1)  is  accord¬ 
ingly  reformulated  in  terms  of  the  state  u  =  u(P)  depending  on  the  probability 
distribution  P.  The  observations  {«j}  can  be  averaged  so  that  it  =  is 

an  observation  for  u(P)  and  one  can  then  attempt  to  estimate  a  best  distribution 
P*  to  fit  this  data  u,  for  example  in  some  type  of  least  squares  fit.  Once  one  ob¬ 
tains  P* ,  its  mean  pi  and  variance  a2  can  be  used  as  a  best  parameter  estimate  and 
measure  of  reliability,  respectively. 

In  the  sections  below  we  give  a  concrete  example  of  this  (using  nonlinear  pa¬ 
rameter  dependent  ordinary  differential  equations  for  the  system  (1)).  We  present 
a  precise  formulation  of  this  conceptual  approach,  show  that  fundamental  results 
from  probability  theory  can  be  used  to  develop  well-posedness  results  (existence, 
continuous  dependence,  and  method  stability)  along  with  approximation  ideas  that 
are  computationally  tractable.  We  demonstrate  feasibility  of  the  resulting  algo¬ 
rithms  by  presenting  a  summary  of  numerical  findings  using  an  example  arising  in 
estimation  of  effectiveness  of  vaccination  policies  in  a  population  of  susceptibles  in 
disease  prophylactics. 
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We  believe  the  underlying  philosophy  as  well  as  the  specific  formulation  are 
applicable  to  and  will  be  useful  in  a  wide  class  of  practical  applications. 


2  Parameter  Estimation  in  Nonlinear  Systems 

To  demonstrate  our  ideas,  we  will  examine  the  estimation  of  constant  parameters 
in  a  system  of  ordinary  differential  equations.  These  ideas  can  be  easily  extended 
to  many  systems  of  interest  in  applications,  including  systems  of  partial  differential 
equations  with  unknown  functional  parameters  (e.g.,  time  and/or  spatially  depen¬ 
dent  coefficients). 

A  typical  estimation  problem  employs  observations  x  =  {ah'}/=1  for  x(ti),  i  = 
1,  2,  .  .  .,  n,  to  estimate  parameters  q  £  Mm  in  the  vector  dynamical  system 

i(t)  =  f(t,x(t),q).  (2) 

Often  a  least  squares  formulation  is  used  to  find  a  best  parameter  value  q*  in  some 
admissible  parameter  set  Q  C  Mm.  In  other  words,  we  attempt  to  find  q*  £  Q  which 
is  a  minimum  for 

n 

J{q,x)  —  'y  |  x(t{,  q')  —  X{  | 

8  =  1 

over  q  £  Q  where  x(tp,  q)  is  a  solution  of  (2)  for  a  given  q  £  Q. 

To  introduce  uncertainty,  we  view  the  parameters  as  realizations  for  a  random 
variable  and  use  the  data  to  estimate  the  probability  distribution  function  (PDF) 
for  this  random  variable.  Specifically,  let  J/Q)  denote  the  set  of  probability  distri¬ 
butions  on  Q  and  treat  the  data  {£i}  as  observations  for  the  expected  value 

Z[x(ti;q)\P]=  f  x(ti\  q)dP(q)  (3) 

Jq 

for  a  given  PDF  P  £  CP(<3).  Note  that  if  P  is  a  discrete  PDF  with  atoms  {qj  }fLi  C  Q 
and  associated  probabilities  {pj},  Pj  >  0,  ’f2j=iPj  =  1,  then  (3)  can  be  written 

.  M 

/  x(ti]q)dP(q)  =  y2x(ti;qj)pj. 

JQ 

Regardless  of  the  form  of  P ,  the  least  squares  estimation  problem  can  be  described 
as  finding  P*  £  J/Q)  to  minimize 


J(P)  =  Y,  I£[^;?)l^]-^l2  (4) 

2  =  1 
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over  P  G  CP(Q).  To  develop  theoretical  and  computational  results  for  this  problem, 
it  is  necessary  to  have  a  topology  on  CP(Q),  continuity  of  the  function  P  —>■  J(P) 
in  this  topology,  compatible  compactness  results,  and  some  approximation  results 
leading  to  implementable  computational  algorithms.  In  order  to  address  these  issues 
we  will  introduce  the  Prohorov  metric  and  summarize  some  results  from  Billingsley 
[5], 

3  The  Prohorov  Metric  in  the  Space  of  Probability 
Distributions 

Let  CP(Q)  be  the  set  of  probability  measures  on  the  Borel  subsets  of  Q,  where  Q  is 
any  complete  metric  space  with  metric  d.  For  any  closed  subset  F  C  Q  and  e  >  0, 
we  define  an  e-neighborhood  of  F  by 

Fe  =  {q  E  Q  ■  d(q,q)  <  e,  q  G  F}. 

We  then  define  p  :  CP(Q)  x  CP(Q)  — >■  M+  by 

p(P1,P2)  =  inf{e  >  0  :  P^F]  <  P2[FC }  +  e,  F  closed,  F  C  Q}. 

The  following  properties  of  p  are  well-known: 

(a)  p  is  a  metric  (called  the  Prohorov  metric)  on  CP(Q); 

(b)  (CP (Q),p)  is  a  complete  metric  space; 

(c)  if  Q  is  compact,  then  (CP (Q),p)  is  a  compact  metric  space. 

We  would  like  to  understand  convergence  of  Pk  —>■  P  in  the  p  metric.  Unfortunately, 
the  Prohorov  metric  is  neither  intuitive  nor  easy  to  use  directly.  However,  it  is  well 
known  that  if  ( Q,d )  is  a  complete  metric  space  and  (CP (Q),p)  is  defined  as  above, 
then  for  Pj.,  P  G  CP(Q),  the  following  convergence  statements  are  equivalent: 

(i)  p(Pk,P)^  0; 

(ii)  fg  fdPk(q)  —>■  fg  fdP(q)  for  all  bounded,  uniformly  continuous  f  :  Q  —>  M1; 

(iii)  Pk[A]  —>■  P[A ]  for  all  Borel  sets  A  C  Q  with  P[dA]  =  0,  where  dA  denotes 
the  boundary  of  A. 

The  equivalence  of  (i)  and  (iii)  reveals  that  convergence  in  the  p  metric  is  equiv¬ 
alent  to  convergence  in  distribution.  Moreover,  if  we  consider  CP (Q)  C  C'g(Q)  where 
C'b(Q)  denotes  the  space  of  bounded,  continuous  functions  on  Q  with  the  supre- 
mum  norm,  then  (i)  and  (ii)  imply  that  convergence  in  the  p  topology  is  equivalent 
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to  weak*  convergence  in  ‘J'(Q).  For  our  discussions,  we  will  make  critical  use  of  the 
equivalence  between  p(Pk,  P)  —>  0  and 

/  x(t;q)dPk(q)  ->  f  x(t;  q)dP(q), 

Jq  Jq 

which  in  turn  is  the  same  as 

Z[x(t;q)\Pk]  ->■  Z[x(t;q)\P]. 

This  convergence  is  needed  to  establish  continuity  in  the  p  topology  of  the  map 

n 

p^J(p)  =  J2  m^;q)\P}-^\2- 
8  =  1 

Continuity  of  this  map,  along  with  the  compactness  of  Q,  which  guarantees  com¬ 
pactness  of  ‘J'(Q)  in  the  p  metric,  is  sufficient  to  establish  existence  of  a  solution  to 
the  problem  of  minimizing  (4)  over  ‘J'(Q). 

If  we  assume  existence  questions  are  answered,  and  turn  to  the  task  of  character¬ 
izing  and/or  finding  minimizers,  we  note  that  ‘J'(Q)  with  the  p  metric  is  in  general  an 
infinite  dimensional  space  because  in  general  Q  will  be  infinite  dimensional.  Thus, 
to  address  computational  issues  one  must  consider  approximation  ideas.  To  do 
this  we  first  prove  a  density  theorem  that  will  be  useful  in  establishing  continuous 
dependence  of  estimates  on  data  as  well  as  in  constructing  approximation  schemes. 

There  are  numerous  topologies  on  ‘J'(Q).  If  we  define  a  W-neighborhood  of  P 
as  NS(P)  =  {Pi  G  0>(Q)  :  P^Fi)  <  P{F{)  +  6,  i  =  1,  •••,*,  F{  closed,  F{  £  §} 
for  a  given  6  >  0  and  finite  set  {F{}f=1,  this  induces  a  topology  on  ‘3>{Q)  which 
is  equivalent  to  the  topology  of  weak  convergence,  W  (see  [5],  p.  236).  We  can 
also  define  a  p-neighborhood  of  P  by  Ne(P)  =  {.Pi  £  J’(Q)  :  p(P,Pi)  <  e}  for  a 
given  e  >  0.  If  Q  is  a  separable  space,  the  W  topology  is  equivalent  to  the  topology 
induced  by  the  p-neighborhoods.  We  will  be  using  the  equivalence  of  these  two 
topologies  in  the  proof  of  Theorem  3.1.  Here  N+  are  the  positive  integers,  fR  are 
the  rational  numbers,  and  bqj  is  the  Dirac  measure  with  atom  at  qj . 

Theorem  3.1  Let  Q  be  a  complete,  separable  metric  space  with  metric  d,  §  be 
the  class  of  all  Borel  subsets  of  Q  and  1P(Q)  be  the  space  of  probability  measures 
on  (Q,§).  Let  Q o  =  {qj}<j<L1  be  a  countable,  dense  subset  of  Q.  Then  the  set  of 
P  E  J’(Q)  such  that  P  has  finite  support  in  Q o  and  rational  masses  is  dense  in 
1P(Q)  in  the  p  metric.  That  is, 

k  k 

J’o(Q)  =  {P  e  ? (Q) :  p  =  ^PjSq,,  k  e  N+,  qj  e  Q0,  pj  e  %  pj  >  o,  ^pj  =  1} 

t=i  t=i 

is  dense  m  1P(Q)  relative  to  p. 
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Proof:  Let  e  >  0  and  let  P  E  J’(Q).  Let  Ne(P)  be  a  p-neighborhood  of  P.  Since 
Q  is  separable,  the  W  and  p  topologies  are  equivalent.  Thus  there  is  a  6  >  0  such 
that  Ns(P)  C  NC(P),  where  Ns(P)  is  a  W-neighborhood  of  P  of  the  form  described 
above  with  closed  sets  F\,  ■  ■  ■ ,  F^. 

Let  be  the  partition  of  (J*=1  Fi  C  Q  generated  by  the  closed  sets 

F\,  ■  ■  ■ ,  Fk-  We  will  assume  each  Bi  is  non-empty  and  so  M  <  oo.  Since  Q o  is 

a  dense  subset  of  Q,  Bi  fl  Qo  0,  for  i  =  1,  •  •  • ,  M . 

For  i  =  1,  •  •  • ,  M ,  select  a  point  X{  E  Bi  fl  Qo-  At  each  point,  Xi,  place  a  mass, 
bi,  which  satisfies  the  following  three  conditions:  i)  bi  E  1R,  ii)  0  <  bi  <  P(Bi),  and 
iii)  | P(Bi)  -b{  |  < 

Now  if  U f=iF{  y^  Q,  select  a  point  xm+ i  so  that  xm+ i  E  Qo  f~l  (U \=1Fi)c .  If 
\jf=1F{  =  Q,  choose  xm+ i  so  that  xm+ i  E  Qo  \  ({xi}iLi)-  In  either  case,  place  at 
xm+i  a  mass  &m+i  =  1  —  Xlfci  Note  that  &m+i  E  01  and  0  <  &m+i  <  1. 

Dehne  P*  =  biSqi.  Then  P*(Q)  =  bi  =  1,  and  0  <  P*(A)  <  1  for 

all  A  E  §.  Thus  P*  E  T0 (Q)  . 

Dehne  Ki  =  {j  :  Fi  fl  Bj  y^  0,  1  <  j  <  M}.  Note  the  set  Ki  has  at  most  M 
indices. 

Now  suppose  \jf=1F{  y^  Q.  Then  for  any  F{, 


\P*(Fi)-P(Fi)  | 


=  bj)\ 

jeKi  jeKi 

=  I  E  P*(B:)~  E 

jeK,  jeK, 

=  i 

jGif,  j  £Kt 

=  iE^'-^')]i 

jeK , 

jeKt 

S 

<  ^  2M 


<  6. 


Now  suppose  uf=1Ti  =  Q.  If  *m+i  ^  Fi,  the  above  argument  shows  \P*(Fi )  — 
P(Fi) |  <  (5.  If  *m+i  E  Fi,  then 


5 


\P*(Fi)-P(Fi)  | 


< 

< 

< 

< 


\p*(  U  Bi)-p(  U  B>)\ 

jeK,  jeK, 

i  E  P*W)~  E  p(B>)\ 

jeK,  jeK, 


|  E  bj+bM+1-  E  PW)\ 

jeKt  jeKt 


M 


i  E^-^')]  +  [i-E^i 

jeK,  j= 1 


M  M 

i  E^-^')]  +  E^')-E^i 


jeK, 


j= i  i=i 

M 


i  E^-^')]  +  E[^)-mi 


jeK, 


j= i 
M 


E  i^-moi  +  Ei^)-^! 

jeK,  j= i 


E  — 

^  2M 

jeK, 

8  8 
2  +  2 

8. 


M 


E 


,5 

2M 


Thus  for  all  T;,  i3*^)  <  P(i^)  +  5,  so  i3*  £  ^(i3). 

Since  N$(P)  C  Ne(P),  P*  G  Ne(P).  By  construction  i3*  G  !Po(<3),  so  lPo(<5)  is 
dense  in  J’(Q)  relative  to  p.  □ 


Theorem  3.1  can  be  used  as  a  basis  for  defining  a  class  of  approximating  sets 
to  be  used  in  tractable  computational  methods  for  the  inverse  problems  defined  in 
Section  2.  First  define 

OO 

Qd  =  U  Qm  (5) 

M  =  1 
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where  Qm  =  {<zjtf}j=i;  M  =  1,2,  -  •  are  chosen  so  that  Qd  is  dense  in  Q.  Note 
that  Qd  is  countable.  For  each  positive  integer  M  let 

M  M 

vm(q)  =  {Pe  nQ)  ■■  p  =  J2pj6*r’  e  Qm *  v e  pj  ^  °>  12 pj  = !>• 

i=i  i=i 

(6) 

If  we  then  define 

MQ)  =  uS=i  ^(Q),  (7) 

then  by  Theorem  3.1  we  know  ‘J'd(Q)  is  dense  in  CP(Q),  and  so  we  can  approximate 
any  element  P  £  CP(Q)  by  a  sequence  {Pm,},  Pm ,  £  J* M,(Q),  such  that  p{PMj,  P) 

0  as  Mj  —>■  oo. 

4  Stability  of  the  Inverse  Problem 

We  now  turn  to  the  study  of  the  inverse  problem.  We  return  to  our  original  problem 
of  finding  a  solution  to 


min  J(P,  x)  =  ’Y]  \x(t{,  P)  -  x{\2.  (8) 

Fe'HQ) 

Given  data  xk  and  x  such  that  xk  x  as  k  — >■  oo  and  corresponding  solutions 
P*(xk)  and  P*(  x)  (which  in  general  are  sets  because  there  is  not  necessarily  a 
unique  minimizer  of  (8)),  we  say  the  problem  is  continuously  dependent  on  the  data 
(or  stable )  if  dist(P*(xk),P*(x ))  — >■  0  as  k  — >■  oo  (see  [3,  4]  for  detailed  discussions 
and  motivation). 

We  now  define  a  series  of  approximate  problems.  Let  CP M (Q)  be  defined  as  in 
(6)  where  Qd  is  a  countable  dense  subset  of  Q  as  defined  in  (5)  with  Qm  =  {df1}- 
We  define  the  approximate  problem  as  finding  a  solution  to 

n 

min  J(PM,x)  =  y^\x(ti,PM)  -  Xi |2.  (9) 

Pm£'Pm(Q) 

Let  P*M{x)  denote  the  set  of  solutions  for  a  given  x.  The  problems  are  method  stable 
(again,  see  [3,  4]  for  further  discussions)  if  for  any  data  xk  and  x  such  that  xk  — >■  x 
as  k  — >■  oo  we  have  dist(P^(xk),  P*(x))  — >■  0  as  k  — >■  oo  and  M  — >■  oo.  Note  that 
this  is  equivalent  to  requiring  dist(P^(xk),  P£j(x))  — >■  0  as  k  — >■  oo  uniformly  in  M . 

Theorem  4.1  Let  Q  be  a  compact  metric  space  and  assume  solutions  x(t;q)  of  (2) 
are  continuous  in  q  on  Q.  Let  CP (Q)  be  the  set  of  all  probability  measures  on  Q 
and  let  Qd  be  a  countable  dense  subset  of  Q  as  defined  in  (5)  with  Qm  =  {qf1  }fL1- 
Define  CP d(Q)  as  in  (7)  where  CP M(Q)  is  defined  as  in  (6).  Suppose  Pfj(xk)  is  the  set 
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of  minimizers  for  J (P)  over  P  £  J 'M (Q)  corresponding  to  the  data  {xk}  and  P*(x) 
is  the  set  of  minimizers  over  P  £  V(Q)  corresponding  to  {x}  where  xk ,  x  £  Mn  are 
the  observed  data  such  that  xk  — >■  x.  Then  dist(P^(xk),  P*(x))  — >■  0  as  M  — >■  oo  and 
xk  — >■  x.  Thus  the  solutions  depend  continuously  on  the  data  and  the  approximate 
problems  are  method  stable. 

Proof:  Since  Q  is  a  compact,  separable  metric  space,  Qd  is  dense  in  Q  and  TdiQ) 
is  the  space  of  all  probability  measures  with  finite  support  in  Qd  and  rational  masses, 
it  follows  from  Theorem  3.1  that  TdiQ)  is  a  dense  subset  of  1P(<5) - 

Since  Q  is  compact,  (CP(<3),  p)  is  compact,  where  p  is  the  Prohorov  metric.  Since 
q  —>■  x(t,  q)  is  continuous  from  Q  to  Mn,  whenever  Pm  P  in  !P(<3)  we  have  that 

n 

lim  J(PM)  =  lim  V] \£.[x(ti,  g)|PM]  -  *i|2 

M— >oo  M— ►oo 

2  =  1 

n  (10) 

=  \£.[x(tt,q)\P]-xt\2 

2  =  1 

=  HP)- 

Since  3* M  (Q)  is  a  closed  subset  of  1P(<5) ,  (Q)  is  compact  in  the  p  topology.  More¬ 

over,  J  is  a  continuous  function  on  the  compact  set  3* M  (Q)  and  thus  for  a  given  set 
of  observations  xk  =  {xk}?=i  £  IRn,  there  exists  a  (not  necessarily  unique)  mini- 
mizer  Pfjf  which  is  a  solution  to  the  problem  of  minimizing  Jk(P )  over  P  £  J* M  (Q) 
where 

n 

Jk(p)  =  J(P,  xk)  =  Y,  I £[*(*•■>  q)\P ]  -  H\2- 
2  =  1 

Let  { xk }  be  a  sequence  that  converges  to  some  arbitrary  x  £  Mn.  Let 
denote  the  set  of  minimizers  of  Jk(P )  over  J* M(Q).  For  k  =  1,2,  -  -  -  and  M  = 
1,2,  -  -  -,  let  {Pf^},  Pm  £  Pm(H),  be  any  sequence  of  minimizers  in  J’(Q).  By 
compactness  there  exists  a  convergent  subsequence  {P^f}  such  that 

lim  p*Mk:  =  p£  HQ)  (11) 

Me,kj->-  oo  * 

in  the  p  metric. 

First  note  that  for  any  Pm,  £  J* Mt(Q) 

Hpp*Mk:)  <  H3(pm,)-  (i2) 
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Then  by  definition  of  Jkj(P^[J),  (10),  and  (11) 


Km  Jk*(p$)  =  Km 


„k3  1 2 


>oo 


2=1 


=  I£[*(ooi-P]  -  £i 


(13) 


=  J(P)- 


Let  P  be  any  element  of  ‘J'(Q).  Since  lPd(Q)  is  dense  in  ‘J'(Q)  we  can  find  a 
sequence  {Pm,},  Pm,  G  3* m,(Q),  so  that  Pm,  P  in  the  p  metric  as  Mi  — >■  oo . 
Then  by  dehnition  of  ,Jk}PM,),  and  (10) 


lim 

,Me-> 


Jk}PM ,) 


So  (12),  (13),  and  (14)  gives 


,  I™  l£[*(0  }\PM,\  ~  TT 

kj,Mi— >OQ 

2  =  1 
n 

^\}x(ti,q)\P]-  Xi\2 
2  =  1 
J(P). 

J(P)  <  J(P) 


(14) 


for  any  P  £  J’(Q).  Hence  P  is  a  minimizer  of  J{P)  over  P  £  J’(Q),  i.e. ,  P  £  P*(x). 

Thus  any  sequence  P}  in  P}(xk)  has  a  subsequence  P{^3  that  converges  to  a 
P  £  P*(x).  So  dist(P}}xk },  P*(x))  —>■  0  as  Mi  —>■  oo  and  kj  —>■  oo.  It  follows  that 
dist(P}(xk ) ,  P*(x))  — >■  0  as  M  — >■  oo  if  xk  — >■  i.  □ 


In  order  to  address  computational  issues,  we  use  the  family  of  approximate 
minimization  problems  defined  above.  If  Qd,  <PM(Q),  and  VdiQ)  are  defined  as 
in  Theorem  4.1,  we  know  from  Theorem  3.1  we  can  approximate  any  P  £  !P(<3) 
by  Pd  G  J’d(Q)-  Furthermore,  from  the  results  established  above  we  also  know  we 
can  approximate  any  P  £  ‘3>{Q)  by  distributions  Pm  £  J>M(Q).  By  choosing  M 
sufficiently  large  we  obtain 

.  ..  M  M 

/  x(t{ ,  q)  dP(q)  PS  /  x{ti,q)'Y'8¥{q)dP{q)  =  'Y'x{ti,qf)pj. 

JQ  Jq  j=1  3  j=1 

Thus  we  can  approximate  J (q)  by 

n  M 

jm(p )  =  \ii  ~  i2 

i= 1  J=1 
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where  p  =  (pi,p2,  ■  ■  ■  ,Pm)  and  J2j=i  Pj  =  1,  Pj  >  0,  Pj  G  X,  1  <  j  <  M. 
If  we  define 


X?  =  (x(U,  q?),x{U,q? x(tt,  q%))T,  XM  =  [X? ,  ■  ■  ■  ,  X?] 

X  -  (^1  7***7  &n) 7 


then  we  can  write 


JM(p)  =  Y,\^-p-X\2  =  \\x-p-xM\\l 

8  =  1 

The  approximate  minimization  problem  thus  reduces  to  a  constrained  optimiza¬ 
tion  problem  for  a  quadratic  cost  functional.  Such  problems  are  amenable  to  a 
number  of  standard  algorithms. 

Thus  we  see  that  any  solution  p  must  satisfy  p  ■  XM  =  X  and  that  if  XM  is 
nonsingular,  p  is  uniquely  determined  and  depends  continuously  on  the  data  X. 


5  Examples 


We  present  a  series  of  examples  to  illustrate  a  computational  algorithm  arising 
from  the  discussions  of  the  previous  sections.  Each  example  uses  the  same  system 
motivated  by  a  problem  in  the  assessment  of  the  efficiency  of  a  vaccination  program 
[6,  2]  by  using  data  {iq}  for  the  aggregate  population  x(t{)  of  vaccinated  but  not 
infected  individuals  at  time  t{.  The  evolution  of  the  population  is  given  by 


x(t)  =  —qG(t)x(t),  *(0)  =  xq, 


(15) 


where  G(t)  represents  the  known  rate  of  exposure  to  infection,  q  is  the  suscepti¬ 
bility  to  “environmental  exposure”  (a  parameter  to  be  chosen  from  an  admissible 
parameter  set  Q )  subsequent  to  vaccination  of  the  population  at  time  t  =  0,  and  xo 
is  the  known  number  of  individuals  initially  vaccinated. 

All  calculations  were  carried  out  using  MATLAB  routines.  We  let  Q  =  [0,  1]  and 
define  Qm  =  {  m~-i  }fLi  ■  Note  Qd  =  U m=iQm  is  a  countable,  dense  subset  of  Q. 
For  a  given  positive  integer  M,  we  would  like  to  find  a  P*  that  is  a  solution  to  (9) 
where  (Q)  is  defined  as  in  (6)  with  our  given  Qm-  The  data  {£{}  is  simulated 
data  we  generate. 

To  generate  simulated  data  we  start  by  taking  N  samples,  on  q*  from 

Q.  The  time  interval  T  =  [0,  1]  is  discretized  by  t{  =  ((,  0  <  i  <  n,  where  T  =  0.01. 
“Data”  {£{}  is  generated  by  first  solving  (15)  at  each  point  x(t{,  q?)  with  G(t)  given 
by 


G(t)  =  130  x 


0  0  <  t  <  0.1 

0.1  <  t  <  0.9 
1  0.9  <  t  <  1  . 
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Then,  to  average  solutions  from  all  samples,  we  define  Xi  =  1j)-  Rel- 

ative  random  noise  was  added  to  the  solutions  so  that  the  “data”  was  given  by 
X{  =  X{  [1  +  £i]  where  the  are  independent  Gaussian  random  variables  with  mean 
zero  and  variance  a2. 

In  our  first  example,  the  samples  of  q*  are  chosen  from  a  normally  distributed 
random  variable  on  Q  with  mean  0.5  and  standard  deviation  0.1.  The  following 
figures  display  the  optimal  estimated  discrete  probability  densities  represented  by 
p  =  (pi,  •  •  •  ,Pm)  and  the  corresponding  probability  distributions  Pm  =  Tl!f=iPi^q3  ■ 
In  Figure  1  we  present  results  of  the  optimization  using  data  that  was  generated 
as  described  with  0%  relative  error.  In  Figure  2  the  data  was  generated  with  5% 
relative  error,  and  in  Figure  3  data  was  generated  with  10%  relative  error.  Note  in 
each  case  as  M  increases  the  probability  distributions  are  converging  in  the  Prohorov 
metric  as  guaranteed  by  the  theory,  while  the  discrete  densities  do  not  converge  in 
any  sense  on  Q  =  [0,1].  In  each  plot  of  the  probability  density,  the  “x” ’s  are  the 
actual  distribution  of  the  generated  data  for  q*  and  in  each  plot  of  the  probability 
distribution,  the  dashed  line  is  the  continuous  distribution  associated  with  q*  that 
the  discrete  distributions  are  attempting  to  approximate.  The  optimal  distributions 
are  graphed  with  piecewise  constant  solid  lines. 

In  addition  to  testing  the  inverse  problem  on  q*  with  a  Gaussian  distribution, 
we  also  carried  out  the  inverse  problem  with  M  =  9  and  0%  and  5%  relative  error 
for  the  following  q*  (  x(tp,  q*)  is  again  represented  by  a  1  x  M  vector  of  values  and 
the  distribution  for  q*  is  approximated  by  p  =  (p i,  •  •  •  ,pg))'. 


•  A  delta  function  at  0.5,  q*  =  0.5,  with  results  presented  in  Figure  4; 


•  Two  delta  functions  at  0.25  and  0.75,  q*(j) 
results  in  Figure  5; 


0.25 

0.75 


0  <  j  <  N/  2 
N/2  <j<  1 


with 


•  Two  skewed  delta  functions  at  0.15  and  0.6,  q*(j) 
with  results  in  Figure  6; 


0.15  0  <  j  <  77/3 
0.6  77/3  <  j  <  1  ’ 


•  A  uniform  distribution  on  [0.25,  0.75],  with  results  in  Figure  7; 

•  A  uniform  distribution  on  [0.1,  0.3]  U  [.55,  .75],  with  results  in  Figure  8; 


•  A  bimodal  Gaussian  distribution  with  one  mean  at  0.3  and  the  other  at  0.7 
and  standard  deviation  0.06,  with  results  in  Figure  9; 


•  A  right  skewed  distribution  with  mean  0  and  standard  deviation  0.2,  with 
results  in  Figure  10. 

In  each  example  the  estimated  probability  distribution  is  a  reasonable  approx¬ 
imation  of  the  continuous  distribution,  both  with  no  error  and  with  5%  relative 
error  on  the  data. 
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Figure  1:  Approximate  probability  densities  and  probability  distributions  for 
normally  distributed  and  0%  error  on  data. 
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Figure  2:  Approximate  probability  densities  and  probability  distributions  for  q* 
normally  distributed  and  5%  error  on  data,. 
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Figure  3:  Approximate  probability  densities  and  probability  distributions  for  q* 
normally  distributed  and  10%  error  on  data. 
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Figure  4:  Approximate  probability  density  and  probability  distribution  for  q*  =  0.5 
with  M  =  9  and  relative  errors  0%  and  5%. 
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Figure  5:  Approximate  probability  density  and  probability  distribution  for  q*  two 
delta  functions  with  M  =  9  and  relative  errors  0%  and  5%. 
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Figure  6:  Approximate  probability  density  and  probability  distribution  for  q*  two 
skewed  delta  functions  with  M  =  9  and  relative  errors  0%  and  5%. 
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Figure  7:  Approximate  probability  density  and  probability  distribution  for  q*  uni¬ 
formly  distributed  on  [.25,  .75]  with  M  =  9  and  relative  errors  0%  and  5%. 
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Figure  8:  Approximate  probability  density  arid  probability  distribution  for  q*  uni¬ 
formly  distributed  on  [.  1 ,  .3]  U  [.55,  .75]  with  M  =  9  and  relative  errors  0%  and 
5%. 
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Figure  9:  Approximate  probability  density  and  probability  distribution  for  q*  bi- 
normally  distributed  with  mean  0.3  and  0.7  and  standard  deviation  0.06  for  each 
with  M  =  9  and  relative  errors  0%  and  5%. 
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Figure  10:  Approximate  probability  density  and  probability  distribution  for  q*  right, 
skew  distributed  with  mean  0  and  standard  deviationO.2  with  M  =  9  and  relative 
errors  0%  and  5%. 
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6  Concluding  Remarks 

In  the  discussions  above  we  have  presented  one  approach  to  the  quantification  and 
computational  treatment  of  uncertainty  in  inverse  problems  of  a  least  squares  for¬ 
mulation.  By  treating  the  estimated  parameter  as  a  random  variable  with  unknown 
distribution,  we  conceptually  reformulate  the  deterministic  parameter  estimation 
problem  into  a  problem  of  estimation  of  a  random  variable  using  sampled  data 
from  a  dynamical  system  which  depends  on  the  parameter.  We  use  powerful  but 
basic  results  from  probability  theory  to  develop  a  theoretical  basis  for  these  new 
problems.  Approximation  results  along  with  continuous  dependence  of  estimates 
on  data  and  method  stability  are  discussed.  To  test  the  ideas  we  present  a  series 
of  numerical  examples  based  on  a  single  model  arising  in  vaccination  and  suscep¬ 
tibility  problems.  The  computational  results  presented  support  the  efficacy  of  our 
approach  and  illustrate  well  the  theoretical  convergence  results  given  in  the  paper. 
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